A Deep Network Model for Paraphrase Detection in Short Text Messages
This paper is concerned with paraphrase detection. The ability to detect
similar sentences written in natural language is crucial for several
applications, such as text mining, text summarization, plagiarism detection,
authorship authentication and question answering. Given two sentences, the
objective is to detect whether they are semantically identical. An important
insight from this work is that existing paraphrase systems perform well when
applied on clean texts, but they do not necessarily deliver good performance
against noisy texts. Challenges with paraphrase detection on user generated
short texts, such as Twitter, include language irregularity and noise. To cope
with these challenges, we propose a novel deep neural network-based approach
that relies on coarse-grained sentence modeling using a convolutional neural
network and a long short-term memory model, combined with a specific
fine-grained word-level similarity matching model. Our experimental results
show that the proposed approach outperforms existing state-of-the-art
approaches on user-generated noisy social media data, such as Twitter texts,
and achieves highly competitive performance on a cleaner corpus.
Consumer-side Fairness in Recommender Systems: A Systematic Survey of Methods and Evaluation
In the current landscape of ever-increasing levels of digitalization, we are
facing major challenges pertaining to scalability. Recommender systems have
become irreplaceable both for helping users navigate the increasing amounts of
data and, conversely, aiding providers in marketing products to interested
users. The growing awareness of discrimination in machine learning methods has
recently motivated both academia and industry to research how fairness can be
ensured in recommender systems. For recommender systems, such issues are well
exemplified by occupation recommendation, where biases in historical data may
lead to recommender systems relating one gender to lower wages or to the
propagation of stereotypes. In particular, consumer-side fairness, which
focuses on mitigating discrimination experienced by users of recommender
systems, has seen a vast number of diverse approaches for addressing different
types of discrimination. The nature of said discrimination depends on the
setting and the applied fairness interpretation, of which there are many
variations. This survey serves as a systematic overview and discussion of the
current research on consumer-side fairness in recommender systems. To that end,
a novel taxonomy based on high-level fairness interpretation is proposed and
used to categorize the research and the fairness evaluation metrics it proposes.
Finally, we highlight some suggestions for the future direction of the field.
Comment: Draft submitted to Springer (November 2022)
A latent model for collaborative filtering
Recommender systems based on collaborative filtering have received a great deal of interest over the last two decades. In particular, recently proposed methods based on dimensionality reduction techniques and using a symmetrical representation of users and items have shown promising results. Following this line of research, we propose a probabilistic collaborative filtering model that explicitly represents all items and users simultaneously in the model. Experimental results show that the proposed system obtains significantly better results than other collaborative filtering systems (evaluated on the MovieLens data set). Furthermore, the explicit representation of all users and items allows the model to, e.g., make group-based recommendations balancing the preferences of the individual users.
Prediction Intervals: Split Normal Mixture from Quality-Driven Deep Ensembles
Prediction intervals are a machine- and human-interpretable way to represent
predictive uncertainty in a regression analysis. In this paper, we present a
method for generating prediction intervals along with point estimates from an
ensemble of neural networks. We propose a multi-objective loss function fusing
quality measures related to prediction intervals and point estimates, and a
penalty function, which enforces semantic integrity of the results and
stabilizes the training process of the neural networks. The ensembled
prediction intervals are aggregated as a split normal mixture accounting for
possible multimodality and asymmetricity of the posterior predictive
distribution, and resulting in prediction intervals that capture aleatoric and
epistemic uncertainty. Our results show that both our quality-driven loss
function and our aggregation method contribute to well-calibrated prediction
intervals and point estimates.
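As a rough illustration of the aggregation idea in this abstract, the sketch below evaluates the density of a split (two-piece) normal distribution and averages several such densities into a mixture. It is not the authors' implementation: the function names, the equal mixture weights, and the example parameters are all assumptions made here for illustration, and the abstract does not say how the ensemble outputs are mapped to these parameters.

```python
import math


def split_normal_pdf(x, mode, sigma_left, sigma_right):
    """Density of a split (two-piece) normal distribution at x.

    The left and right halves are Gaussian-shaped with different scales,
    joined at the mode, which allows an asymmetric distribution.
    """
    norm = math.sqrt(2.0 / math.pi) / (sigma_left + sigma_right)
    sigma = sigma_left if x <= mode else sigma_right
    return norm * math.exp(-((x - mode) ** 2) / (2.0 * sigma ** 2))


def mixture_pdf(x, components):
    """Equal-weight mixture of split normals (equal weights are an assumption).

    Each component is a tuple (mode, sigma_left, sigma_right).
    """
    return sum(split_normal_pdf(x, *c) for c in components) / len(components)


# Hypothetical ensemble of three members; these numbers are made up.
ensemble = [(0.0, 1.0, 2.0), (0.5, 0.8, 1.5), (3.0, 1.0, 1.0)]
density_at_one = mixture_pdf(1.0, ensemble)
```

Because each component may be asymmetric and the components may have different modes, the mixture can be both skewed and multimodal, which is the property the abstract attributes to the aggregated distribution.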